1.  THE EVOLUTION OF ON-LINE CHARACTER RECOGNITION

1.1  Introduction 

The impetus for this project on machine recognition of hand-printed
characters was my dissatisfaction with the current state of on-line
character recognition machinery and algorithms, as described in
Chapter 12 of Principles of Interactive Computer Graphics [4], by Newman
and Sproull. The set of criteria for the "perfect recognizer," as
suggested by Newman and Sproull, is:

(1)  responds  quickly; 

(2)  high  rate  of  success  of  recognition; 

(3)  tolerates  variation  in  size,  style,  and  orientation; 

(4)  uses computer resources sparingly.

The  authors  point  out  that  no  current  (1973)  recognizers  can  claim 
to  meet  all  these  criteria  perfectly. 

I  believe  that  the  hardware  and  software  which  I  will  propose  will 
meet  the  above  criteria  as  closely  as  any  on-line  character  recognition 
machinery  I  have  seen,  while  simultaneously  maintaining  a  modest  cost. 

1.2  History 

The first on-line character recognizers used graphic tablets and
minicomputers as the hardware, and had very inflexible software (the user
could not train the machine to recognize "his" character, but rather the
machine trained the user to adopt a predefined style). An example of
just such a system is discussed by Groner [2].


The first trainable recognizer was developed by Teitelman [5]; others,
including  Bernstein  [1],  soon  developed  trainable  programs  utilizing  the 
RAND  tablet.  While  progress  has  not  altogether  stopped  on  such  systems, 
the  number  of  references  to  recent  work  (after  1970)  is  surprisingly  and 
disappointingly  small. 

1.3  Essential  Characteristics  of  Recognizers 

1.3.1  Hardware 

To recognize the alphabetic, numeric, and special characters of, say,
the ASCII code, as well as other special characters demanded by any
particular system, a graphic tablet is essential. Previous work with RAND and
Sylvania tablets indicates that a large portion of the hardware budget
goes into the tablet itself. My proposal is to substitute a voltage
gradient tablet of the form described by Newman and Sproull [4]:

A Voltage Gradient Tablet
Figure  1 


The advantages of this choice are primarily its lower cost, secondarily
its inherent ability to be fine-tuned in the field, and thirdly the
small amount of additional (and expensive) hardware which must be added to
the basic tablet to realize a useful input device.

In addition to the voltage sources shown in Figure 1, the system
needs simple circuitry to recognize an interrupt from the stylus (making
or breaking contact with the tablet surface), an A/D converter, an
interrupt flip-flop, and two 10-bit X and Y position registers.

The necessary computing can be handled entirely by a microprocessor
of the Intel 8008 variety (1974 prices are approximately $100 for the chip
alone, or $900 for the MCS-8, which includes the 8008 chip, all other
necessary circuitry, and 3K memory). I chose this particular microcomputer
because I was familiar with its capabilities and knew it could easily be
adapted for this system.


1.3.2  Software 


The basic requirements of the software system are:

(1)  read the tablet;

(2)  extract important features from the character;

(3)  dictionary lookup routine;

(4)  training routine to build the dictionary.

All  of  the  above  are  presently  in  my  simulator  program  ONLINE.   I 
am  convinced  that  the  entire  program  will  fit  with  room  to  spare  on  an 
MCS-8. 


1.3.3  Feature Recognition

The crucial part in the design of the recognizer is the choice of
features to be extracted from the input strokes. A good recognizer must
discriminate between different characters while recognizing slightly
different versions of the same character.

The  most  popular  features  extracted  by  current  mechanisms  are  the 
regions  visited  by  each  stroke.  By  predefining  a  standard  rectangle,  or 
by  normalizing  the  stroke  to  some  standard  rectangle,  and  then  dividing 
that  rectangle  into  a  number  of  regions,  a  program  may  then  count  and 
record  the  number  of  times  a  stroke  crosses  a  region  boundary. 

This  is  exactly  the  scheme  used  by  both  Bernstein  [1]  and  Teitelman 
[5],  as  shown  in  Figure  2. 




Stroke  Sequences  Encoded  by  Regions 
Figure  2 

After collecting a stroke sequence, these schemes searched a
tree-structured dictionary, yielding the character to be recognized. Ledeen [3]
developed a simple recognizer which used a similar feature extraction
mechanism, but utilized a more compact dictionary (about 1K 16-bit words).


My objections to these known schemes are three-fold:

(1)  The use of RAND-type tablets is too costly;

(2)  I disagree in theory with having to write into a predefined
rectangle, using predefined region boundaries. The alternative,
"clipping" to the boundaries of the character itself and then
subdividing into regions, only increases the amount and cost of
processing power and memory needed, and hence is equally
unsatisfactory.

(3)  The tree-structured dictionary uses more memory than necessary.
If the dictionary is actually a tree, with nodes and pointers,
approximately 2/3 of the memory space is devoted to pointers and
only 1/3 to data. If the dictionary is a binary tree, but
structured as an array [left son of node n at T(2n) and right son at
T(2n+1)], then the total array area must grow exponentially with the
depth of the tree, i.e., the longest stroke sequence. Since adding
one level to the tree will double the array area required,
effective memory utilization remains a problem.

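
The doubling described in objection (3) can be illustrated with a short Python sketch. It is not part of the ONLINE program. The function name `implicit_tree_slots` is hypothetical, and the sketch assumes the root is stored at T(1), that T(0) is unused, and that `depth` counts the levels below the root.

```python
def implicit_tree_slots(depth):
    """Array slots for a binary tree stored with sons of node n at 2n and 2n+1.

    Assumes the root sits at index 1 (index 0 unused) and that `depth`
    counts the levels below the root, i.e. the longest stroke sequence.
    """
    # The deepest level ends at index 2**(depth + 1) - 1, so the array
    # needs 2**(depth + 1) slots when index 0 is counted.
    return 2 ** (depth + 1)


for depth in range(1, 6):
    print(depth, implicit_tree_slots(depth))
```

Each added level doubles the reserved area whether or not the new level is populated, whereas a sorted list needs one entry per trained stroke sequence.
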
My  solutions  to  the  three  objections  above  are: 

(1)  Use a resistive tablet. This relatively low cost input device
contains all the sophistication necessary for character
recognition;

(2)  Make the recognition process independent of the size and location
of the character drawn. This is accomplished by a new definition
of the "essential features" of characters: the order of strokes
and their direction, independent of their physical location;

(3)  Encode the stroke sequence and orientation into one code number
(explained in Chapter 2, The Recognition Algorithm) and look up the
code number in a sorted dictionary. The code number has the same
information content as the tree traversal order, without the memory
overhead.

The  whole  scheme  is  made  viable  by  using  the  classical  approach  of 
having  an  initial  training  period  in  which  the  user  trains  the  machine 
to  recognize  his  own  writing  style  (in  fact,  the  user  may  teach  several 
stroke  sequences  for  the  same  character).   Following  the  training  period 
the  machine  runs  in  its  normal  recognition  mode.   Provision  is  made  for 
retraining  the  machine  when  another  user  takes  over. 

1.4  Advantages

The  worth  of  the  solution  described  here  is  enhanced  by  its  direct 
applicability  to  a  microprocessor  environment,  its  sparing  use  of  costly 
memory,  and  the  small  amount  of  interface  hardware  necessary  between  the 
tablet  and  processor.   It  allows  the  user  to  train  the  machine  to  his 
own  writing  style.  Different  users  may  operate  the  machine  by  utilizing 
a  retraining  mode.   Furthermore,  the  entire  algorithm  is  programmable 
using  only  addition,  subtraction,  and  testing.  No  multiplications  or
divisions  are  required. 

1.5  Disadvantages 

It was first thought that the requirement of distinct strokes
would allow recognition of only block-style printed characters. However,
the scheme generalized quite nicely to the normal spectrum of printed
characters, with only a very few unnatural cases, which arise when two
different characters are described by the same stroke pattern.


1.6  Summary 


I believe that the "Weaver method" represents an elegant solution
to an otherwise sticky problem. Like its predecessors, this method fails
to fully satisfy all of Newman and Sproull's basic criteria for the
perfect recognizer. Yet, it satisfies sufficiently many of them that it
would perform well in many common situations. Its most significant
advantages are its low cost and ease of implementation.

Appendix 1 contains a list of the alphanumeric characters and their
stroke sequence definitions for the author's style of writing. Appendix
2 is a listing of the computer program which simulates the hardware and
software systems. Appendix 3 is a sample run which shows the training
and recognition of all 36 alphanumeric characters. Appendix 4
illustrates the change of mode, error detection, and symbol non-recognition.


2.  THE RECOGNITION ALGORITHM

2.1  The ONLINE Simulation Program

The  program  ONLINE  simulates  the  actions  of: 

(1)  A  resistive  tablet  and  stylus; 

(2)  The  hardware  interface  between  tablet  and  processor,  including 
X-  and  Y-position  10-bit  registers  and  1-bit  interrupt  flag; 

(3)  The processor (a microprocessor, such as the Intel 8008).

This simple equipment is shown to be sufficient for an on-line
character recognition system.

2.1.1  Theory 

The touching of the stylus to the tablet generates an interrupt
and sets an interrupt flag. Upon seeing this flag the processor copies
the contents of the position registers into its variables X1 and Y1.
The lifting of the stylus from the tablet also generates an interrupt
and sets an interrupt flag. This time the processor copies the
position registers into its variables X2 and Y2. Now by simple
subtraction and testing, the processor determines a four-bit code word for
each line (X1,Y1) -> (X2,Y2) drawn.

The code is of the form C1 C2 C3 C4,

where  C1 is 1 only if the line was drawn from top to bottom,
       C2 is 1 only if the line was drawn from bottom to top,
       C3 is 1 only if the line was drawn from left to right,
and    C4 is 1 only if the line was drawn from right to left.

The code word is set by a program segment similar to this:

    C1 = C2 = C3 = C4 = 0;
    if (Y1-Y2) > EPSILON then C1 = 1;
    else if (Y2-Y1) > EPSILON then C2 = 1;
    if (X2-X1) > EPSILON then C3 = 1;
    else if (X1-X2) > EPSILON then C4 = 1;
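
For readers who want to experiment with this step, here is a minimal Python sketch of the same tests. It is an illustration, not the microprocessor code. The function name `direction_code` and the default tolerance of 16 tablet units are assumptions, and the sketch assumes that Y increases toward the top of the tablet, as the tests above imply.

```python
def direction_code(x1, y1, x2, y2, epsilon=16):
    """Return the 4-bit code word C1 C2 C3 C4 as an integer from 0 to 10.

    (x1, y1) is where the stylus touched down and (x2, y2) where it lifted.
    `epsilon` is a hypothetical tolerance; the text does not give a value.
    """
    code = 0
    if y1 - y2 > epsilon:
        code += 8  # C1: top to bottom
    elif y2 - y1 > epsilon:
        code += 4  # C2: bottom to top
    if x2 - x1 > epsilon:
        code += 2  # C3: left to right
    elif x1 - x2 > epsilon:
        code += 1  # C4: right to left
    return code


# First stroke of "A": down and to the left.
print(direction_code(500, 900, 300, 100))
```

As in the segment above, only subtraction, comparison, and addition are used, and a movement smaller than `epsilon` along an axis leaves both bits for that axis at zero.
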

2.1.2  Program  Input 

The input to the simulator program for any one character is a series
of "strokes" which represent the hardware result of the previous
algorithm. For instance, the letter "A" is defined by three strokes:

stroke number    stroke orientation               stroke encoding

      1          top to bottom & right to left    TB,RL/
      2          top to bottom & left to right    TB,LR/
      3          left to right                    LR/

Each stroke is encoded by using the obvious 2-character mnemonic
for each of the four possible directions, with a "/" used to indicate
the end of one stroke. Thus the full description of "A" is:

A:    TB,RL/TB,LR/LR/
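
The mnemonic format can be read mechanically. The following Python sketch is an illustration under stated assumptions, not the ONLINE source: `parse_strokes` and `MNEMONIC_BITS` are hypothetical names, and each mnemonic is given the binary weight of its bit in the code word C1 C2 C3 C4.

```python
# Weight of each direction bit when C1 C2 C3 C4 is read as a binary number.
MNEMONIC_BITS = {"TB": 8, "BT": 4, "LR": 2, "RL": 1}


def parse_strokes(description):
    """Turn a description such as 'TB,RL/TB,LR/LR/' into 4-bit code words."""
    if not description.endswith("/"):
        raise ValueError("each stroke must be terminated by '/'")
    code_words = []
    # Drop the final '/' so that split() yields exactly one field per stroke.
    for field in description[:-1].split("/"):
        word = 0
        for mnemonic in field.split(","):
            mnemonic = mnemonic.strip()
            if mnemonic:
                word += MNEMONIC_BITS[mnemonic]
        code_words.append(word)
    return code_words


print(parse_strokes("TB,RL/TB,LR/LR/"))
```

For the letter "A" this yields the three code words 9, 10, and 2.
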

2.1.3  Stroke  Encoding 

From this description, provided as a character string on an input
card to the simulator, the processor decodes the input into a sequence of
4-bit code words, one code word per stroke. Now, by using a binary
digit weighting scheme on the 4-bit words, a unique hexadecimal (0-15)
digit results.

Note that, of the 16 possible codes, only 8 are used (combinations
like top-to-bottom and bottom-to-top are clearly impossible). The usable
combinations are shown in Figure 3.

C1 (TB)   C2 (BT)   C3 (LR)   C4 (RL)   Hexadecimal
                                        Equivalent

   0         0         0         1           1
   0         0         1         0           2
   0         1         0         0           4
   0         1         0         1           5
   0         1         1         0           6
   1         0         0         0           8
   1         0         0         1           9
   1         0         1         0          10

Listing of 4-bit Code Words
Figure 3

Thus any sequence of strokes is reducible to a series of hex digits;
"A" is encoded as 9/10/2/.

Since 10 is the largest hex digit used, subtracting one from each
of the hex equivalents in Figure 3 provides a corresponding set of
decimal digits which are more easily handled. The decimal digits
corresponding to the stroke sequence are then weighted with a power of ten,
according to their positions in the stroke sequence, to yield a single
decimal number as the representative code number. Repeating the example
of the character "A",

character    simulator input    hex       decimal    code number
A            TB,RL/TB,LR/LR/    9/10/2    8/9/1      891

Clearly,  this  process  may  be  run  backwards  to  generate  the  defining 
stroke  sequence  which  generated  any  given  code  number. 
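
A short Python sketch of both directions follows. The names `encode` and `decode` are hypothetical and the code is an illustration rather than the ONLINE source. One assumption is made explicit: `decode` is told the number of strokes, because a leading right-to-left stroke maps to the decimal digit 0, which a plain integer does not preserve.

```python
def encode(hex_digits):
    """Combine per-stroke hex digits (1 to 10) into one decimal code number."""
    code = 0
    for h in hex_digits:
        # Shift earlier digits one decimal place, then append (hex digit - 1).
        code = code * 10 + (h - 1)
    return code


def decode(code, n_strokes):
    """Recover the per-stroke hex digits from a code number."""
    digits = []
    for _ in range(n_strokes):
        code, d = divmod(code, 10)
        digits.append(d + 1)
    digits.reverse()  # digits were peeled off from the last stroke first
    return digits


print(encode([9, 10, 2]))
print(decode(891, 3))
```

For "A" the encoder gives 891 and the decoder returns 9, 10, 2. In a microprocessor version the multiplication by ten could be replaced by repeated addition.
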

The worst case for the length of a stroke sequence for common
alphanumeric characters appears to be 4 strokes. Even allowing for 5
strokes, the corresponding code number is less than 10^5, and so remains
well within the range of microcomputer arithmetic.
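
The size claim is easy to check. The sketch below is illustrative only; `max_code_number` is a hypothetical helper that assumes every stroke contributes the largest decimal digit, 9.

```python
def max_code_number(strokes):
    """Largest code number an n-stroke sequence can produce (all digits 9)."""
    return 10 ** strokes - 1


for n in (4, 5):
    top = max_code_number(n)
    # bit_length() gives the number of binary digits needed to hold the value.
    print(n, top, top.bit_length())
```

A 4-stroke code needs 14 bits and a 5-stroke code needs 17 bits, so on an 8-bit processor such as the 8008 a code number occupies two or three bytes.
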

While the above scheme works quite well, generalizing from block
characters to printed characters required one minor modification:
recognition of the "null" stroke generated by curves which begin and end
within EPSILON of the same point, as in the letter "O", the number "0",
the letter "Q", and the numbers "6", "8", and "9". This was easily
accomplished by allowing a previously unused decimal digit (6) to mark a
"null" stroke.

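The null stroke fits into the encoding with one extra test. The Python sketch below is an illustration with hypothetical names (`NULL_DIGIT`, `stroke_digit`, `code_number`); it assumes a null stroke arrives as the code word 0, that is, with no direction bit set.

```python
NULL_DIGIT = 6  # decimal digit reserved for the null stroke


def stroke_digit(code_word):
    """Map a 4-bit code word to its decimal digit; 0 means a null stroke."""
    return NULL_DIGIT if code_word == 0 else code_word - 1


def code_number(code_words):
    """Combine the digits of a stroke sequence into one decimal code number."""
    number = 0
    for word in code_words:
        number = number * 10 + stroke_digit(word)
    return number


print(code_number([0, 0]))  # two closed curves, e.g. "8"
print(code_number([8, 0]))  # a downstroke, then a closed curve, e.g. "6"
```

Digit 6 cannot collide with a directional stroke, because no usable code word in Figure 3 has the hexadecimal value 7.
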
2.1.4  The  Dictionary 

Now that the input stroke sequence has been encoded into a few bits,
it appears to be best managed by storing the decimal code numbers and the
characters which they represent in a dictionary, formed from two linear
arrays, and arranged in sorted order by ascending code numbers. The
beginning of a typical dictionary might be:

    code number    character

        071            C
        711            F
        891            A
         .             .
         .             .

A binary search on the code numbers suffices to quickly locate
any character in the list. In recognition mode this is exactly what
happens; in training mode the addition of a new code-number-and-character
combination uses the binary search to find its proper
position in the list. The list is then shifted to make a hole, and the new
information is inserted. The efficiency of the binary search is well
known; for a list of, say, 64 characters, the number of probes
required to locate any character is approximately log2 64 = 6 (at most 7
in the worst case).
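
The two parallel arrays and the search-then-shift insertion can be sketched in Python as follows. This is an illustration, not the ONLINE source; the class name `StrokeDictionary` is hypothetical, and the standard library routine `bisect_left` stands in for the hand-written binary search.

```python
from bisect import bisect_left


class StrokeDictionary:
    """Two parallel lists kept sorted by ascending code number."""

    def __init__(self):
        self.codes = []
        self.chars = []

    def lookup(self, code):
        """Return the character stored for `code`, or None if absent."""
        i = bisect_left(self.codes, code)
        if i < len(self.codes) and self.codes[i] == code:
            return self.chars[i]
        return None

    def insert(self, code, char):
        """Insert a new pair at its sorted position; ignore duplicates."""
        i = bisect_left(self.codes, code)
        if i < len(self.codes) and self.codes[i] == code:
            return False
        # list.insert shifts the later entries down to open a hole.
        self.codes.insert(i, code)
        self.chars.insert(i, char)
        return True


d = StrokeDictionary()
for code, char in [(891, "A"), (71, "C"), (711, "F")]:
    d.insert(code, char)
print(d.codes, d.chars)
print(d.lookup(711))
```

Whatever the training order, the lists end up in ascending code order, which is the property the binary search relies on.
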

2.1.5  Simulator  Operation 


The simulator works in one of five modes:

Training:      A character is presented and its stroke sequence defined.
               Example:  A:   TB,RL/TB,LR/LR/
                         Z:   LR/TB,RL/LR/

Recognition:   Only the stroke sequence is presented; it is decoded and
               its associated character printed.
               Example:  TB/TB/LR/LR/

Restart:       Clears all currently stored information in preparation
               for a new training sequence.

Ignore:        Invalid commands are ignored.

Stop:          Program terminates.

2.1.5.1  Training  Mode  Detail 

The  first  character  of  the  input  card  contains  the  new  character 
to  be  learned.   The  remaining  characters  are  the  stroke  sequence 
mnemonics  which  the  user  wishes  to  associate  with  this  character.   The 
stroke  sequence  is  decoded  into  its  code  number  n,  the  current  (sorted) 
list  of  code  numbers  is  binary  searched  to  find  this  number's  proper 
location,  all  entries  whose  code  number  is  larger  than  n  are  moved  down, 
and  the  current  code  number  and  character  are  inserted. 

2.1.5.2  Recognition  Mode  Detail 

Each input card contains only the simulated stroke sequence. The
sequence is decoded into its corresponding code number and the current
dictionary is searched. If the code number is found, its associated
character is echoed in the message STROKE SEQUENCE RECOGNIZED
AS THE CHARACTER "*", where * is the recognized character. If the code
number is not found, the message "CHARACTER NOT RECOGNIZED. TRY
AGAIN" appears. If, when the stroke sequence is repeated, it is still
not recognized, the message "STILL NOT RECOGNIZED. RETRAIN FOR THIS
SYMBOL" is produced. It is probable that the user has now either changed
his defining stroke sequence for a particular character, or used a new
symbol unfamiliar to the recognizer. In either case, the recognizer
should return to training mode to learn a new stroke sequence; this is
speedily accomplished with the $TRAIN command.
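
The interplay of the modes can be seen in a compact Python sketch. It is a simplified illustration, not a translation of the PL/I listing in Appendix 2: the names `run` and `code_number` are hypothetical, stroke cards received in ignore mode are simply skipped, and the miss counter is advanced only in recognition mode.

```python
from bisect import bisect_left

BITS = {"TB": 8, "BT": 4, "LR": 2, "RL": 1}


def code_number(sequence):
    """Encode 'TB,RL/TB,LR/LR/' as a decimal code number (null stroke = 6)."""
    if not sequence.endswith("/"):
        raise ValueError("INPUT FORMAT ERROR")
    number = 0
    for field in sequence[:-1].split("/"):
        word = sum(BITS[m.strip()] for m in field.split(",") if m.strip())
        number = number * 10 + (6 if word == 0 else word - 1)
    return number


def run(cards):
    """Process command and stroke cards; return the list of messages."""
    codes, chars = [], []
    mode, misses, output = "ignore", 0, []
    for card in cards:
        card = card.strip()
        if card.startswith("$"):
            if card == "$STOP":
                break
            if card == "$RESTART":
                codes, chars, mode, misses = [], [], "ignore", 0
            elif card in ("$TRAIN", "$RECOGNIZE"):
                mode = card[1:].lower()
            else:
                mode = "ignore"  # invalid command
            continue
        if mode == "train":
            char, n = card[0], code_number(card[2:].strip())
            i = bisect_left(codes, n)
            if i == len(codes) or codes[i] != n:
                codes.insert(i, n)
                chars.insert(i, char)
        elif mode == "recognize":
            n = code_number(card)
            i = bisect_left(codes, n)
            if i < len(codes) and codes[i] == n:
                misses = 0
                output.append(
                    f'STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "{chars[i]}"')
            elif misses == 0:
                misses = 1
                output.append("CHARACTER NOT RECOGNIZED. TRY AGAIN")
            else:
                misses += 1
                output.append("STILL NOT RECOGNIZED. RETRAIN FOR THIS SYMBOL")
    return output


for line in run(["$TRAIN", "A: TB,RL/TB,LR/LR/", "E: TB/LR/LR/LR/",
                 "$RECOGNIZE", "TB/LR/LR/LR/", "TB/TB/LR/LR/",
                 "TB/TB/LR/LR/", "$STOP"]):
    print(line)
```

The sample cards train "A" and "E", recognize "E", and then present an untrained sequence twice, producing the two failure messages in order.
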

APPENDIX 1


The  alphanumeric  characters  and  their  defining  stroke 
sequences,  as  determined  by  the  author's  style  of 
writing. 

A POSSIBLE STROKE SEQUENCE FOR THE
ALPHANUMERIC CHARACTERS

[Table of hand-drawn strokes, not reproducible as text. For each
alphanumeric character it shows the strokes in order (first stroke,
second stroke, and so on), each drawn as an arrow, followed by the
decimal code number of the sequence. The same stroke sequences appear
in mnemonic form in the training input of Appendix 3.]

APPENDIX 2


Source code listing for the simulator program ONLINE.

/*  A SIMULATOR OF ON-LINE CHARACTER RECOGNITION  */
ONLINE: PROCEDURE OPTIONS(MAIN);

/*  A PROJECT FOR CS 397, INTERACTIVE COMPUTER GRAPHICS (C. W. GEAR)
    PROGRAMMED BY ALFRED C. WEAVER
    DATE:  MAY 4, 1974

    THIS PROGRAM SIMULATES A MICROPROCESSOR SUCH AS THE INTEL 8008
    WITH A VOLTAGE GRADIENT (RESISTIVE) TABLET AS THE INPUT DEVICE.
    THE ALGORITHM RECOGNIZES HAND-WRITTEN CHARACTERS BY EXTRACTING
    ESSENTIAL FEATURES (STROKE SEQUENCE AND ORIENTATION) FROM THE INPUT.
    STROKE SEQUENCES ARE ENCODED AS A DECIMAL CODE NUMBER
    WHICH SERVES AS REFERENCE IN A SORTED, LIST-STRUCTURED DICTIONARY.

    THE RECOGNIZER OPERATES IN BOTH "TRAINING" AND "RECOGNITION"
    MODES.  WHILE IN TRAINING THE RECOGNIZER ENCODES STROKE
    SEQUENCES AND BUILDS A DICTIONARY.  WHEN RECOGNIZING
    THE INPUT STROKES ARE ENCODED AND THE DICTIONARY SEARCHED
    FOR THE CORRESPONDING CHARACTER.  PROVISION IS MADE WITH
    THE "RESTART" MODE TO CLEAR MEMORY AND BEGIN TRAINING BY
    A DIFFERENT USER.

    THE ALGORITHM IS DESIGNED FOR A MICROPROCESSOR ENVIRONMENT
    WITH LITTLE ADDITIONAL INTERFACE HARDWARE.  THE PRIMARY
    ADVANTAGES ARE LOW COST, SIMPLICITY OF PROGRAMMING, AND
    EASE OF USE.  */

/*  PREPARE  THE  MICROPROCESSOR  DATA  AREA  */ 

DCL  RCHAR(1:64) CHAR(1),      /* LIST OF RECOGNIZED CHARACTERS */
     NEWCHAR CHAR(1),          /* NEW CHARACTER IN TRAINING MODE */
     CARD CHAR(80) VAR,        /* INPUT CARD IMAGE */
     FOUND BIT(1),             /* LOCATE FLAG */
     (POINTER,                 /* LOCATION POINTER SET BY 'LOOKUP' */
      #ENT,                    /* CURRENT NUMBER OF TABLE ENTRIES */
      CODE#,                   /* ENCODED STROKE SEQUENCE */
      NOTFOUND,                /* COUNT OF INPUT ERRORS */
      LIST(1:64),              /* LIST OF RECOGNIZED CODE WORDS */
      MODE) FIXED BINARY(31);  /* MODE OF OPERATION:
                                  =0  IGNORE
                                  =1  TRAIN
                                  =2  RECOGNIZE
                                  =3  STOP */

/*  INITIALLY CLEAR THE DATA AREA  */
CALL RESTART;

/*  REPEAT UNTIL $STOP COMMAND IS PROCESSED  */
DO WHILE (MODE ¬= 3 /* STOP */);

   /*  READ AND PRINT INPUT CARD  */
   GET EDIT (CARD) (COL(1), A(30));
   PUT SKIP(2) EDIT (CARD) (A);

   /*  IF COLUMN 1 IS $, THEN CARD IS A COMMAND  */
   IF SUBSTR(CARD,1,1) = '$' THEN

      /*  SET PROPER MODE  */
      IF CARD = '$TRAIN' THEN MODE = 1;
      ELSE IF CARD = '$RECOGNIZE' THEN MODE = 2;
      ELSE IF CARD = '$STOP' THEN MODE = 3;
      ELSE IF CARD = '$RESTART' THEN CALL RESTART;
      ELSE MODE = 0;

   /*  OTHERWISE, CARD IS AN INPUT STROKE SEQUENCE  */
   ELSE DO;

      /*  DECODE THE STROKE SEQUENCE  */
      CALL DECODE;

      /*  FIND CODE WORD IN TABLE  */
      CALL LOOKUP;

      /*  WHICH MODE ARE WE IN?  */
      IF MODE = 1 THEN /* IN TRAINING MODE */ CALL INSERT;
      ELSE /* IN RECOGNITION MODE */ DO;
         IF FOUND /* ALREADY IN DICTIONARY */ THEN PUT EDIT
            ('STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "',
             RCHAR(POINTER), '"') (3 A);
         ELSE /* NOT IN DICTIONARY */ DO;
            IF NOTFOUND = 1 /* FIRST TIME */ THEN PUT EDIT
               ('CHARACTER NOT RECOGNIZED. TRY AGAIN') (SKIP, A);
            ELSE /* SECOND TIME NOT FOUND */ PUT EDIT
               ('STILL NOT RECOGNIZED. RETRAIN FOR THIS SYMBOL')
               (SKIP, A);
         END;
      END;
   END;
END;

PUT SKIP(3) EDIT ('END OF PROGRAM') (A);
RETURN;


RESTART: PROCEDURE;

   /*  A PROCEDURE TO CLEAR THE DATA AREA  */

   DECLARE I FIXED BINARY(31);

   POINTER, #ENT, CODE#, MODE, NOTFOUND = 0;
   DO I = 1 TO 64;
      LIST(I) = 0;
      RCHAR(I) = ' ';
   END;
END RESTART;


LOOKUP: PROCEDURE;

   /*  A PROCEDURE TO FIND THE LOCATION OF 'CODE#' IN 'LIST'  */

   DECLARE (HEAD INIT(0), TAIL, MID, #2 INIT(2)) FIXED BINARY(31);

   TAIL = #ENT + 1;
   FOUND = '0'B;

   /*  STANDARD BINARY SEARCH  */
   DO WHILE (TAIL-HEAD > 1);
      MID = (HEAD+TAIL)/#2;
      IF CODE# < LIST(MID) THEN TAIL = MID;
      ELSE IF CODE# > LIST(MID) THEN HEAD = MID;
      ELSE DO;
         FOUND = '1'B;
         NOTFOUND = 0;
         POINTER = MID;
         RETURN;
      END;
   END;
   POINTER = HEAD;
   NOTFOUND = NOTFOUND + 1;
END LOOKUP;


INSERT: PROCEDURE;

   /*  A PROCEDURE TO INSERT 'CODE#' AND 'NEWCHAR' INTO DICTIONARY  */

   DECLARE I FIXED BINARY(31);

   IF FOUND THEN RETURN;
   POINTER = POINTER + 1;

   /*  SHIFT LARGER ENTRIES DOWN TO MAKE A HOLE  */
   DO I = #ENT TO POINTER BY -1;
      LIST(I+1) = LIST(I);
      RCHAR(I+1) = RCHAR(I);
   END;
   LIST(POINTER) = CODE#;
   RCHAR(POINTER) = NEWCHAR;
   #ENT = #ENT + 1;
END INSERT;

DECODE: PROCEDURE;

   /*  A PROCEDURE TO READ STROKE SEQUENCE AND PRODUCE 'CODE#'  */

   DECLARE FIELD CHAR(10) VAR, (I,C) FIXED BINARY(31);

   /*  IF IN TRAINING MODE, EXTRACT NEW CHARACTER  */
   IF MODE = 1 THEN DO;
      NEWCHAR = SUBSTR(CARD,1,1);
      CARD = SUBSTR(CARD,3);
   END;
   CODE# = 0;

   /*  REPEAT FOR EACH STROKE  */
   DO WHILE (CARD ¬= ' ');
      I = INDEX(CARD,'/');

      /*  CATCH INPUT FORMAT ERRORS  */
      IF I = 0 THEN DO;
         /*  STOP ON INPUT FORMAT ERROR  */
         PUT SKIP EDIT ('INPUT FORMAT ERROR') (A);
         STOP;
      END;
      ELSE DO;
         FIELD = SUBSTR(CARD,1,I);
         CARD = SUBSTR(CARD,I+1);
         C = 0;
         IF INDEX(FIELD,'TB')>0 THEN C=C+8;
         IF INDEX(FIELD,'BT')>0 THEN C=C+4;
         IF INDEX(FIELD,'LR')>0 THEN C=C+2;
         IF INDEX(FIELD,'RL')>0 THEN C=C+1;

         /*  A FIELD WITH NO DIRECTION IS A NULL STROKE (DIGIT 6)  */
         IF C = 0 THEN C = 7;

         /*  IN THE MICROPROCESSOR CODE THIS MULTIPLICATION MAY BE
             REPLACED BY ADDITIONS  */
         CODE# = CODE# * 10 + C - 1;
      END;
   END;
END DECODE;


END ONLINE;

APPENDIX 3

Sample output from ONLINE showing training and recognition
for the 36 alphanumeric characters.

$TRAIN
A: TB,RL/TB,LR/LR/
B: TB/TB/TB/
C: TB/
D: TB/TB/
E: TB/LR/LR/LR/
F: TB/LR/LR/
G: TB,LR/LR/
H: TB/LR/TB/
I: LR/TB/LR/
J: TB,RL/LR/
K: TB/TB,RL/TB,LR/
L: TB,LR/
M: TB,RL/TB,LR/TB,RL/TB,LR/
N: TB/TB,LR/TB/
O: /
P: BT/
Q: /TB,LR/
R: BT/TB,LR/
S: TB,RL/
T: TB/LR/
U: LR/
V: TB,LR/TB,RL/
W: TB,LR/TB,RL/TB,LR/TB,RL/
X: TB,RL/TB,LR/
Y: TB,LR/TB,RL/TB/
Z: LR/TB,RL/LR/
0: /TB,RL/
1: TB/TB,RL/
2: TB,LR/TB/
3: LR/TB,RL/TB,RL/
4: TB.RL.TB/
5: TB/TB,RL/LR/
6: TB//
7: LR/TB,RL/
8: //
9: /TB/

$RECOGNIZE
TB,RL/TB,LR/LR/               STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "A"
TB/TB/TB/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "B"
TB/                           STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "C"
TB/TB/                        STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "D"
TB/LR/LR/LR/                  STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "E"
TB/LR/LR/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "F"
TB,LR/LR/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "G"
TB/LR/TB/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "H"
LR/TB/LR/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "I"
TB,RL/LR/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "J"
TB/TB,RL/TB,LR/               STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "K"
TB,LR/                        STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "L"
TB,RL/TB,LR/TB,RL/TB,LR/      STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "M"
TB/TB,LR/TB/                  STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "N"
/                             STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "O"
BT/                           STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "P"
/TB,LR/                       STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "Q"
BT/TB,LR/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "R"
TB,RL/                        STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "S"
TB/LR/                        STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "T"
LR/                           STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "U"
TB,LR/TB,RL/                  STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "V"
TB,LR/TB,RL/TB,LR/TB,RL/      STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "W"
TB,RL/TB,LR/                  STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "X"
TB,LR/TB,RL/TB/               STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "Y"
LR/TB,RL/LR/                  STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "Z"
/TB,RL/                       STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "0"
TB/TB,RL/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "1"
TB,LR/TB/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "2"
LR/TB,RL/TB,RL/               STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "3"
TB,RL.TB/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "S"
TB/TB,RL/LR/                  STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "5"
TB//                          STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "6"
LR/TB,RL/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "7"
//                            STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "8"
/TB/                          STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "9"
$STOP 

END    OF     PROGRAM 


APPENDIX 4

Sample  output  from  ONLINE  showing  change  of  mode, 
error  detection,  and  symbol  non-recognition. 

$RECOGNIZE
TB/BT/TB/
CHARACTER NOT RECOGNIZED. TRY AGAIN
$TRAIN
A: TB,RL/TB,LR/LR/
B: TB/TB/TB/
C: TB/
D: TB/TB/
E: TB/LR/LR/LR/
F: TB/LR/LR/
$RECOGNIZE
TB/LR/LR/LR/                  STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "E"
TB,RL/TB,LR/LR/               STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "A"
TB/TB/LR/LR/
CHARACTER NOT RECOGNIZED. TRY AGAIN
TB/TB/LR/LR/
STILL NOT RECOGNIZED. RETRAIN FOR THIS SYMBOL
$TRAIN
#: TB/TB/LR/LR/
$RECOGNIZE
TB/TB/LR/LR/                  STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "#"
$RESTART
$TRAIN
0: /TB,RL/
1: TB/TB,RL/
2: TB,LR/TB/
3: LR/TB,RL/TB,RL/
4: TB,RL/TB/
5: TB/TB,RL/LR/
6: TB//
7: LR/TB,RL/
8: //
9: /TB/
$RECOGNIZE
TB,LR/TB/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "2"
TB//                          STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "6"
/TB/                          STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "9"
$BADCOMMAND
$BADCOMMAND
$RECOGNIZE
TB,LR,TB,RL
INPUT FORMAT ERROR